1.   INTRODUCTION

Ratio estimators have been used quite extensively in sample surveys,
not only as estimators of population ratios, but also as estimators of
population means and totals. In the latter case they involve the use of an
extra variable, correlated with the variable of interest. These ratio
estimators, although known to be biased, have often been preferred over the
traditional unbiased mean per unit estimator, since it has been demonstrated
that in a great many situations the ratio estimator has a smaller variance.
A major drawback of the ratio estimator is that it is biased, although in
large samples the bias has been shown to be negligible. In very small
samples, or even moderate samples from a stratified population, no really
convincing argument has been given for the negligibility of the bias, since
no exact expression for it is available. Several authors have avoided this
question of bias by developing methods which eliminate the bias while
retaining the essential properties of a ratio estimator.

This paper reviews the usual ratio estimator, giving optimum conditions
for its use. The bias is approximated and limits for the bias are given,
as well as cases in which the bias might become an important factor.
Methods are then considered which give rise to reduced bias estimators, as
well as unbiased ratio-type estimators. The latter are divided into two
major classes of development: (1) the elimination of bias through the use
of commonly used sampling schemes, and (2) the elimination of bias through
modifications of sampling schemes that make the usual biased estimator
unbiased.


2.   THE  BIASED  RATIO  ESTIMATOR 

The classic estimator for a population mean, $\bar{Y}$, or population
total, $Y$, has been the sample mean, $\bar{y}$, or the inflated sample
mean, $N\bar{y}$, where $N$ is the finite population size. In the past
quarter of a century, the ratio estimator, using a variable $x$ correlated
with the variable of interest $y$ to estimate population means and totals,
has come into prominence, especially in surveys. The usual simple ratio
estimator is

$$R_1 = \bar{y}/\bar{x},$$

the ratio of the two sample means. Corresponding estimators of the
population mean and total are, respectively,

$$\hat{\bar{Y}}_1 = (\bar{y}/\bar{x})\,\bar{X}$$

and

$$\hat{Y}_1 = (\bar{y}/\bar{x})\,X,$$

where $X$ is the population total of the $x$ values and $\bar{X} = X/N$ is
their population mean. It is noted that, except for estimating the ratio,
the population total of the $x$ values has to be known.

Although these estimators are known to be biased except in certain
situations, it is very common in practice that they have a smaller variance
than those based on the mean per unit estimator,

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i .$$

Cochran (2) explains that, for large samples, if the correlation
coefficient between $y$ and $x$ is greater than one half of the coefficient
of variation of $x$ divided by the coefficient of variation of $y$, the
ratio estimate of $Y$ has a smaller variance than the simple expansion
estimate, $N\bar{y}$. This occurs very often in survey practice. One of the
common uses of ratio estimators is when $x_i$ is the value of $y_i$ at some
previous time, and here the two coefficients of variation may be about
equal. If the coefficient of correlation is greater than 0.5 in this case,
the ratio estimate is superior.
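Cochran's large-sample condition can be checked numerically from paired
data. The following is a minimal sketch, not part of the original paper;
the function name `ratio_beats_expansion` and the data are illustrative
assumptions, and sample moments stand in for the population quantities.

```python
from math import sqrt


def ratio_beats_expansion(x, y):
    """Return (rho, threshold, decision) for Cochran's large-sample rule.

    The ratio estimate is preferred when rho > CV(x) / (2 * CV(y)).
    Sample moments are used here as stand-ins for population values.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    rho = cov_xy / sqrt(var_x * var_y)
    cv_x = sqrt(var_x) / mean_x
    cv_y = sqrt(var_y) / mean_y
    threshold = cv_x / (2 * cv_y)
    return rho, threshold, rho > threshold


# Hypothetical data: x is last year's value, y is this year's value.
rho, threshold, use_ratio = ratio_beats_expansion(
    [10, 12, 15, 18, 20], [21, 23, 31, 35, 41]
)
```

When the two coefficients of variation are nearly equal, as in this
illustrative data, the threshold is close to 0.5, which matches the rule of
thumb stated above.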

Cochran (2) also applied the Gauss-Markov Theorem to show that if the
regression of $y$ on $x$ is a straight line through the origin, and the
variance of $y_i$ about this line is proportional to $x_i$, then the ratio
estimate is a minimum-variance unbiased estimator. It is also known that
in large samples the distribution of $R_1$, the simple ratio estimator,
tends to a normal distribution, and since the bias is of order $1/n$, the
bias tends to zero.

There are cases when the existence of a bias becomes an important
factor. Goodman and Hartley (7) state that there is one very important
class of surveys in which the bias may become of vital interest. This
arises when drawing small samples from a large number ($k$) of strata. It
often occurs in sampling that the bias in each sample will be of the same
sign, so the bias in the estimate of the population total will be $k$
times the bias for a stratum total. Since the variance is only multiplied
by $k$, the mean square error of the estimate of the population total will
be of order of magnitude $k^2$, whereas if the estimator were unbiased the
order of magnitude would be $k$. It is evident that an unbiased ratio
estimator would be of great advantage in this case. Lahiri (18)
particularly emphasizes the risk involved in using the usual (biased)
ratio estimator in small samples from many strata. Since no such risk is
involved with the unbiased ratio-type estimators, they make more extensive
stratification possible.
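The order-of-magnitude argument can be written out under simplifying
assumptions that are not stated in the original: each of the $k$ strata
contributes an independent estimator of its stratum total, with a common
bias $B$ and a common variance $V$.

```latex
\begin{aligned}
\operatorname{Bias}(\hat{Y}) &= \sum_{h=1}^{k} B = kB, \\
\operatorname{Var}(\hat{Y})  &= \sum_{h=1}^{k} V = kV, \\
\operatorname{MSE}(\hat{Y})  &= \operatorname{Var}(\hat{Y})
      + \operatorname{Bias}(\hat{Y})^{2} = kV + k^{2}B^{2}.
\end{aligned}
```

For large $k$ the term $k^2 B^2$ dominates unless $B = 0$, in which case
the mean square error is $kV$.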


Devices for reducing and eliminating the bias have mostly been developed
since the early 1950's. Although many of the estimators arrived at seem
very burdensome to calculate, this seems an unimportant objection to
their use, since much survey work is being done by computers.

3.   THE  ALMOST  UNBIASED  RATIO-TYPE  ESTIMATOR 

3.1.  Early  Work 
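Since the usual ratio estimator

$$R_1 = \bar{y}/\bar{x}$$

is, essentially, a ratio of two random variables, an exact expression for
its bias cannot be obtained in a straightforward manner. The first
practical method proposed for finding the bias used a Taylor's series
expansion. With $R = \bar{Y}/\bar{X}$,

$$R_1 - R = \frac{\bar{y}}{\bar{x}} - R = \frac{\bar{y} - R\bar{x}}{\bar{x}}
= \frac{\bar{y} - R\bar{x}}{\bar{X}} \cdot
  \frac{\bar{X}}{\bar{X} + (\bar{x} - \bar{X})}
= \frac{\bar{y} - R\bar{x}}{\bar{X}}
  \left(1 - \frac{\bar{x} - \bar{X}}{\bar{X}}
        + \frac{(\bar{x} - \bar{X})^2}{\bar{X}^2} - \cdots\right).$$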

Cochran (2) used the above expression to find the leading term in the
bias, which is

$$\frac{N-n}{nN}\,\frac{1}{\bar{X}^2}\,\bigl(R\,V(x) - C(x,y)\bigr),$$

where

V(x) is the population variance of x,
C(x,y) is the population covariance of x and y.

These results were sometimes used to obtain checks on the size of the bias
in a specific sample by substituting sample values, but until 1951 no
serious thought was given to finding unbiased estimators of the ratio type.
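The check described above, substituting sample values into the leading
term of the bias, might be sketched as follows. This is an illustration
only: the function name and data are hypothetical, and the sample mean of
$x$ is used in place of $\bar{X}$, which is an assumption of the sketch.

```python
def ratio_bias_leading_term(xs, ys, N):
    """Sample-based estimate of the leading bias term of R1 = ybar / xbar.

    Implements (N - n) / (n N) * (R1 * S2(x) - S(x, y)) / xbar**2, with
    sample moments substituted for the population quantities.
    """
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    r1 = y_bar / x_bar
    s2_x = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    return (N - n) / (n * N) * (r1 * s2_x - s_xy) / x_bar ** 2


# Hypothetical sample of n = 3 from a population of N = 30.
bias_estimate = ratio_bias_leading_term([1, 2, 3], [2, 3, 7], N=30)
```

If $y$ is exactly proportional to $x$ in the sample, the estimated leading
term is zero, in line with the regression-through-the-origin case noted
earlier.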

3.2.  Koop's  Estimator 
Koop (17), in 1951, found Taylor's theorem to be an unsatisfactory
method of expansion for finding the bias of the simple ratio estimator,
since it requires that $R_1$ be differentiable near $(\bar{Y}, \bar{X})$.
Since $R_1$ is not continuous, it is not differentiable. Koop (17)
obtained an expression for the bias by using a binomial series expansion,
then substituted sample values in the expression for the bias, reducing it
to any desired degree. The following estimator due to Koop is unbiased to
order $1/n^3$.


p    -/-   w   rS^(^)   !llil£lwN-n)   w2.!l2^   ^03^^'^^  (n-n)(N-2n) 
Rg  =  y/x  -  1/n  (^^  -   _  _   j  (^J-  1/n  [-TZ^ 13 ]   ^I^^ 

X       X  y  y  X       X 


0 

3(n-l)  (i£U})l       ^ii^y'^^^  ^^\  N(N-n)(N-n-l) 
^3^-2    "      -  -3    -'  (n-l)(N-2)(N-3) 


w  3  r!ol4^I:^   ^13^^*''\  (N-n)(N^-6Nn  •>-  N  -t-  6n^; 
-  1/n   I  _4     -   -  -3  ^    (N-l)(N-2)(N-3) 


where 


„      n  (x  -  x) 
1=1 


-.2 


S%)   -  ^ 


I  iy,  -  y)' 


n-l 


I  (y,  -  y)'(x^  -  x)^ 

„     ,       N   k=l  ^ ^ 


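This formula was admitted by Koop to be a clumsy and crude method having
possibly large sampling errors. However, the method used to obtain this
estimate is of theoretical interest. Koop's procedure was as follows:

$$\frac{\bar{y}}{\bar{x}}
= \frac{\sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} x_i}
= \frac{\sum_{i=1}^{N} y_i - \sum_{k=1}^{N-n} y_k}
       {\sum_{i=1}^{N} x_i - \sum_{k=1}^{N-n} x_k}
= \frac{N\bar{Y} - (N-n)\bar{y}'}{N\bar{X} - (N-n)\bar{x}'},$$

where

$$\bar{y}' = \frac{1}{N-n}\sum_{k=1}^{N-n} y_k , \qquad
  \bar{x}' = \frac{1}{N-n}\sum_{k=1}^{N-n} x_k$$

are the means of the $N-n$ units not included in the sample.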

Koop (17) states the conditions that must be satisfied to expand

$$\left(1 - \frac{N-n}{N}\,\frac{\bar{x}'}{\bar{X}}\right)^{-1}$$

as a binomial series and shows that the conditions are satisfied. For an
exact proof, see Koop (17). He mentions another method for finding the
bias, which involves writing

$$\frac{\bar{y}}{\bar{x}} = \frac{\bar{Y}}{\bar{X}}
  \left(1 + \frac{\bar{y} - \bar{Y}}{\bar{Y}}\right)
  \left(1 + \frac{\bar{x} - \bar{X}}{\bar{X}}\right)^{-1}$$

and finding its expected value by expanding the last factor as a binomial
series. This expansion resulted in the same expression for the bias as the
previous method.

3.3.  Quenouille's  Estimator 

Quenouille (25), in 1956, developed a method for reducing bias in a
large class of estimators. He considered the general problem of estimating
an unknown parameter $T$ from a function $t_n(x_1, x_2, \ldots, x_n)$ of a
series of $n$ observations taken in random order. The estimator can often
be written as a function of the unbiased estimates of the cumulants,
$k_1, k_2, \ldots, k_m$. Quenouille noted that the moments of the estimates
of the cumulants are power series in $1/n$, and therefore the bias in
$t_n$ could be expressed as a power series in $1/n$ if the following
conditions hold:

(1)  $m$ is independent of $n$

(2)  $t_n$ can be expanded by a Taylor's series

(3)  $t_n$ is consistent

If the above conditions hold, then

$$E(t_n) - T = \frac{a_1}{n} + \frac{a_2}{n^2} + \cdots$$

If one considers an estimator $t'_n$, where

$$t'_n = n\,t_n - (n-1)\,t_{n-1},$$

then

$$E(t'_n) = T - \frac{a_2}{n^2} - \cdots$$

and therefore the bias of $t'_n$ is of order $1/n^2$. See Quenouille (25)
for proof of this. Also $t''_n$, where

$$t''_n = \frac{n^2\,t'_n - (n-1)^2\,t'_{n-1}}{n^2 - (n-1)^2},$$

is biased to order $1/n^3$ only, and so on. He also stated that any subset
of the observations may be used to correct for bias. Another result was
the estimator $t'_{2p}$,

$$t'_{2p} = 2\,t_{2p} - t_p ,$$

whose bias is of order $1/n^2$.

Quenouille worried somewhat about loss of efficiency in a procedure
like this, but stated that if the average over all possible sets of $n-1$
observations, $\bar{t}_{n-1}$, is used in place of $t_{n-1}$, little loss
of efficiency should result.
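Quenouille's correction, using the average over all sets of $n-1$
observations in place of $t_{n-1}$, can be sketched for an arbitrary
estimator. The code below is an illustration under that reading; the names
`quenouille` and `plug_in_variance` are hypothetical. The divisor-$n$
variance is used as the test estimator because its bias is exactly of the
form $a_1/n$, so the correction removes it completely.

```python
def quenouille(estimator, data):
    """Return n * t_n - (n - 1) * mean of the leave-one-out estimates."""
    n = len(data)
    t_n = estimator(data)
    # Average of the estimator over all n subsets of size n - 1.
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    t_bar = sum(leave_one_out) / n
    return n * t_n - (n - 1) * t_bar


def plug_in_variance(values):
    """Biased variance estimate with divisor n (bias of order 1/n)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Hypothetical observations; the corrected value equals the usual
# divisor-(n - 1) sample variance.
corrected = quenouille(plug_in_variance, [2.0, 4.0, 4.0, 7.0, 9.0])
```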


Durbin (6), in 1959, applied Quenouille's findings to ratio estimators,
finding that if the regression of $y$ on $x$ is linear and $x$ is normal,
Quenouille's device actually decreased the variance. The estimate Durbin
considered was

$$R_3 = 2R_1 - \tfrac{1}{2}\,(R_{11} + R_{12}),$$

where

$R_1$ is the simple ratio estimate from a sample of size $n$,

$R_{11}$, $R_{12}$ are the simple ratio estimates from the two halves
of the sample.

The following example from Deming (4) illustrates its use.

Characteristic                  Sample 1    Sample 2    Both

Total rent                      $2720       $2350       $5070
Total number of delinquents     33          31          64
Average rent                    $82.42      $75.81      $79.22

R_3 = 2(79.22) - 1/2 (82.42 + 75.81)
    = $79.33 .
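The arithmetic of the Deming example can be reproduced directly from the
totals in the table. This is a small sketch with a hypothetical function
name; it works from the unrounded averages, so it gives about 79.32,
whereas the rounded averages used in the text give 79.33.

```python
def split_half_ratio(total_y, total_x, half_totals):
    """Twice the full-sample ratio minus the mean of the two half ratios."""
    r_full = total_y / total_x
    r_halves = [half_y / half_x for half_y, half_x in half_totals]
    return 2 * r_full - sum(r_halves) / 2


# Totals from the table: rent and number of delinquents,
# for both samples combined and for each sample separately.
average_rent = split_half_ratio(5070, 64, [(2720, 33), (2350, 31)])
```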

Durbin (6) also considered the case where $x$ has a gamma distribution
and found that, although the variance is increased by using $R_3$, the
mean square error is decreased. For proofs of these cases, see Durbin (6).

Kish, Namboodiri, and Pillai (15) also looked at Quenouille's results
and were dissatisfied with them, saying that the degree of reduction in
bias did not warrant the increased cost in computation, and that there
were no practical methods for estimating the variance.

The general form of Quenouille's method as applied to ratio estimates
was discussed by Rao (32). This form is

$$R_4 = g\,R_1 - \frac{g-1}{g}\sum_{j=1}^{g} R'_j ,$$

where

$R_1$ is the usual biased ratio estimator,

$R'_j$ is the usual ratio estimator omitting the $j$-th group,

$g$ is the number of groups of equal size into which the sample of
size $n$ is split.

This form, with $g = 2$, reduces to the estimator $R_3$ considered by
Durbin. Rao, assuming that the regression of $y$ on $x$ was linear and
that $x$ was normally distributed, found the variance of $R_4$ for general
$g$ to order $n^{-3}$. He showed that both the bias and the variance of
$R_4$ were decreasing functions of $g$, and therefore the optimum choice
for $g$ would be $n$. The estimate

$$R_5 = n\,R_1 - \frac{n-1}{n}\sum_{j=1}^{n} R'_j ,$$

where $R'_j$ is the estimate obtained by omitting the $j$-th observation,
may be preferred to others.
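The grouped form of Quenouille's method for the ratio estimator might be
coded as below. This is a sketch under stated assumptions: the sample is
taken to be already in random order, groups are formed from consecutive
observations, and the function names are illustrative rather than taken
from Rao.

```python
def simple_ratio(xs, ys):
    """Usual ratio estimate: ratio of the sample means (or totals)."""
    return sum(ys) / sum(xs)


def quenouille_ratio(xs, ys, g):
    """g * R1 - (g - 1) / g * sum of ratios with one group omitted."""
    n = len(xs)
    if n % g != 0:
        raise ValueError("g must divide the sample size")
    size = n // g
    r_full = simple_ratio(xs, ys)
    omitted = []
    for j in range(g):
        start, stop = j * size, (j + 1) * size
        # Ratio estimate from the sample with the j-th group left out.
        omitted.append(
            simple_ratio(xs[:start] + xs[stop:], ys[:start] + ys[stop:])
        )
    return g * r_full - (g - 1) / g * sum(omitted)


# Hypothetical sample; g = 2 gives the split-half form, g = n omits
# one observation at a time.
xs, ys = [1, 2, 3, 4], [2, 5, 5, 9]
half_split = quenouille_ratio(xs, ys, g=2)
one_at_a_time = quenouille_ratio(xs, ys, g=4)
```

With `g=2`, omitting one half leaves the other half, so the result is twice
the full-sample ratio minus the mean of the two half-sample ratios.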

Tin (38) compares Quenouille-based estimators with others, discussed
later in this paper, and also considers two extensions. One extension led
to the same result previously considered by Rao. He states that as $g$ is
increased the variance becomes smaller, as Rao proves assuming normality,
but Tin also says that the estimator becomes more biased. He also states a
condition, involving $n$, $\bar{X}$, and the cumulants of $x$ and $y$
(with $k_{ij}$ denoting the $ij$-th cumulant), under which one of these
estimators is more efficient than another, together with a range of
values, beginning at 2, over which the estimator is both less biased and
more efficient. For a discussion of this, see Tin (38).

Tin's other extension was to divide the sample into two halves, and
then divide each of these further into two halves. He then obtains the
estimator

$$R_6 = \tfrac{8}{3}\,R_1 - (R_{11} + R_{12})
      + \tfrac{1}{12}\,(R_{111} + R_{112} + R_{121} + R_{122}),$$

where

$R_1$ is the usual ratio estimate,

$R_{1j}$ is the usual ratio estimate calculated from the $j$-th half
of the sample,

$R_{1jk}$ is the usual ratio estimate calculated from the $k$-th half
of the $j$-th half of the sample.

This was shown by Tin to be less biased, but also less efficient, than
both the simple ratio estimator and Durbin's estimator. As Cochran (2)
mentions, these estimators derived from Quenouille's general method cannot
be expected to be of help when small samples are taken within strata,
which is of course when an unbiased or reduced bias estimator would be of
the most help. These estimators are useful, however, in another respect:
when taking only moderate samples from a population having wide variation
in the $x$ variate.

3.4.  Beale's Estimator
Beale (1) derived an asymptotic expansion for both the bias and the
variance of the simple ratio estimator in terms of the coefficient of
variation. Using this he obtained the following estimator:

$$R_7 = R_1\,
  \frac{1 + \left(\frac{1}{n} - \frac{1}{N}\right)
            \frac{S(x,y)}{\bar{x}\,\bar{y}}}
       {1 + \left(\frac{1}{n} - \frac{1}{N}\right)
            \frac{S^2(x)}{\bar{x}^2}},$$

where

$R_1 = \bar{y}/\bar{x}$,

$S(x,y)$ = sample covariance,

$S^2(x)$ = sample variance of $x$.

This estimator removes the leading term in the bias and also decreases the
asymptotic variance. Beale also mentioned that the extra cost is negligible
if one wants to estimate the variance, since the above quantities are
needed for this. This appeared to Tin (38) to be one of the better
ratio-type estimators from the standpoint of degree of bias and efficiency.

Tin (38), in an effort to reduce the bias in the simple ratio estimate,
developed the following estimator:

$$R_8 = R_1\left(1 + \left(\frac{1}{n} - \frac{1}{N}\right)
  \left(\frac{S(x,y)}{\bar{x}\,\bar{y}} - \frac{S^2(x)}{\bar{x}^2}\right)
  \right),$$

where the symbols are defined as in $R_7$. This has the same general form
as Beale's estimator when terms of order $1/n^2$ are neglected. This
estimator is less biased than the simple ratio estimator and also more
efficient, a result that surprised Tin. He showed that this does not hold
indefinitely: as the bias is reduced further, a point is reached at which
the estimator starts to become less efficient. Tin (38) also compared
$R_7$ and $R_8$ above with $R_3$, Durbin's application of Quenouille's
method to ratio estimates, and $R_1$, the simple ratio estimate. Tin
showed that Beale's estimator was the least biased, followed by Tin's
modified ratio estimator, which was less biased than Quenouille's method
as applied by Durbin. A comparison between Durbin's estimator and the
usual estimator has already shown Durbin's to be superior in most cases.
The variances were then compared and, to the order of approximation
considered, the modified ratio estimator $R_8$ was the most efficient,
followed by Beale's estimator and then Durbin's estimator. Tin also showed
that there is little difference in their approach to normality in large
samples, but for small samples (n=50) the modified ratio estimator appears
to be the best in regard to bias, efficiency, and approach to normality,
followed by Beale's, Durbin's, and the simple ratio estimator, in that
order.
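Beale's and Tin's estimators differ only in how the same two sample
quantities enter the correction, which a short sketch makes visible. The
function names and data below are hypothetical; the formulas follow the
standard forms of the two estimators, with the factor $(1/n - 1/N)$
written as `theta`.

```python
def _sample_moments(xs, ys):
    """Return n, xbar, ybar, S2(x), and S(x, y) with divisor n - 1."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s2_x = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    return n, x_bar, y_bar, s2_x, s_xy


def beale_estimator(xs, ys, N):
    """R1 * (1 + theta * c_xy) / (1 + theta * c_xx)."""
    n, x_bar, y_bar, s2_x, s_xy = _sample_moments(xs, ys)
    theta = 1 / n - 1 / N
    c_xy = s_xy / (x_bar * y_bar)
    c_xx = s2_x / x_bar ** 2
    return (y_bar / x_bar) * (1 + theta * c_xy) / (1 + theta * c_xx)


def tin_estimator(xs, ys, N):
    """R1 * (1 + theta * (c_xy - c_xx))."""
    n, x_bar, y_bar, s2_x, s_xy = _sample_moments(xs, ys)
    theta = 1 / n - 1 / N
    c_xy = s_xy / (x_bar * y_bar)
    c_xx = s2_x / x_bar ** 2
    return (y_bar / x_bar) * (1 + theta * (c_xy - c_xx))


# Hypothetical sample of n = 3 from a population of N = 30.
r_beale = beale_estimator([1, 2, 3], [2, 3, 7], N=30)
r_tin = tin_estimator([1, 2, 3], [2, 3, 7], N=30)
```

Expanding Beale's quotient to first order in `theta` gives Tin's form, so
the two values agree closely whenever `theta` is small.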

Another modification of $R_1$ was obtained by Tin, by subtracting an
estimate of the bias, to obtain an estimator that is less biased but also
less efficient than $R_8$. The estimator was

«9  "  «1  (^  *  (^-  1^  (Slx^.  s!(|l)  ^,  _  3(i   Ij  (Sfixljj) 


lif 


where  the  symbols  are  as  defined  previously. 

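An estimate of the variance of $R_1$, $R_7$, $R_8$, or $R_9$ to order
$1/n$, supplied by Tin (37), is

$$\left(\frac{1}{n} - \frac{1}{N}\right) R_1^2
  \left(\frac{S^2(x)}{\bar{x}^2} + \frac{S^2(y)}{\bar{y}^2}
        - \frac{2\,S(x,y)}{\bar{x}\,\bar{y}}\right),$$

which does not involve much extra computation, since $S^2(x)$ and $S(x,y)$
are already needed for the estimates themselves.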

3.5.  Jones's Method for Correction of Bias
Jones (14) wrote about a graphic procedure used by Tukey to get an
estimate of the bias and correct for it by using replicated samples. Since
the bias contains the factor $1/n$, the bias decreases as the sample size
increases. If it is inconvenient or costly to take large samples, one may
use the following procedure to get an estimate of the ratio one would
obtain by increasing the size of the sample indefinitely. First, divide
the sample into $g$ subsamples, calculate the simple ratio estimator for
each of the $g$ subsamples, and average them. Next, combine the $g$
subsamples into equal groups of size $m_i$, obtaining $g/m_i$ groups for
each choice of $m_i$, and find the average of the simple ratio estimator
calculated for each of the $g/m_i$ groups. To illustrate this part of the
procedure, consider the case $g = 10$. Here the possible choices for $m_i$
are $m_1 = 2$, $m_2 = 5$, $m_3 = 10$, yielding 5, 2, and 1 groups
respectively. This gives $i + 1$ average ratios, where $i$ is the number
of choices of $m_i$. The second step is to plot these on coordinate paper
against the number of subsample estimates used to compute the average. For
$g = 10$, the averages would be plotted 10, 5, 2, and 1 units away,
respectively, where the length of the unit is immaterial. The third step
is to draw the line of best fit. Extrapolation to zero gives a quick
estimate, $R_{10}$, of the ratio one would obtain by increasing the size
of the sample indefinitely. This procedure should also be useful when the
relationship between the bias and the reciprocal of the sample size is not
linear. An example of the use of this process follows.

A sample of size 50 was taken from a population with Y = 40, X = 80.
The sample was randomly divided into 10 subgroups, $\bar{y}/\bar{x}$ was
computed for each of the 10 subgroups, and their average was found. The
averages for 5, 2, and 1 subgroups were also found by combining the 10
subgroups. The following results were obtained.

Table 1.  Sample Data for Jones's Graphical Method

Averages of the simple ratio estimator for 10, 5, 2, and 1 groups
(values legible in the source: .5048, .4980, .5009).

Fig. 1.  Illustration of Jones's Method (average ratio, on a scale from
.498 to .506, plotted against the number of groups)

The resulting estimate of the ratio one would obtain by increasing
the sample size indefinitely is .5000.
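The graphical step can be imitated numerically. The sketch below is an
assumption-laden stand-in for the freehand procedure: it replaces the
line drawn by eye with an ordinary least-squares line and returns its
intercept at zero. The function name and the data are hypothetical and are
not the values of Table 1.

```python
def extrapolate_to_zero(units, averages):
    """Intercept at zero of the least-squares line through the points."""
    k = len(units)
    mean_u = sum(units) / k
    mean_a = sum(averages) / k
    slope = sum(
        (u - mean_u) * (a - mean_a) for u, a in zip(units, averages)
    ) / sum((u - mean_u) ** 2 for u in units)
    return mean_a - slope * mean_u


# Plotting positions for g = 10: averages over 10, 5, 2, and 1 groups.
positions = [10, 5, 2, 1]
# Hypothetical averages lying exactly on a line with intercept 0.5.
averages = [0.5 + 0.001 * u for u in positions]
r_extrapolated = extrapolate_to_zero(positions, averages)
```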

3.6.  Murthy and Nanjamma's Estimator
Murthy and Nanjamma (21) developed a technique to estimate the bias
of the simple ratio estimate. This was used to obtain an almost unbiased
estimate by using a correction factor. The simple ratio estimate is

$$R_1 = \bar{y}/\bar{x} .$$

Another biased estimate often used is the average of the individual ratios,

$$R_n = \frac{1}{n}\sum_{i=1}^{n} y_i/x_i .$$

This latter estimator is often used when a ratio estimator seems
appropriate but the variance of $y$ does not increase linearly with $x$.

Murthy and Nanjamma (21), using a series expansion and neglecting
terms of degree greater than two, expressed the bias of $R_1$ as

$$B_1 = \frac{1}{n\bar{X}^2}\,\bigl(R\,V(x) - C(x,y)\bigr)$$

and the bias of $R_n$ as

$$B_n = \frac{1}{n}\sum_{i=1}^{n} B(y_i/x_i),$$

where

$B(y_i/x_i)$ is the bias of $y_i/x_i$,

$V(x)$ is the population variance of $x$,

$C(x,y)$ is the population covariance of $x$ and $y$.

The quantity $B_n$ can be written

$$B_n = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{\bar{X}^2}
        \bigl(R\,V(x) - C(x,y)\bigr)
      = \frac{1}{\bar{X}^2}\,\bigl(R\,V(x) - C(x,y)\bigr).$$

Therefore, to the second degree of approximation,

$$B_n = n\,B_1 ,$$

and

$$E(R_n - R_1) = B_n - B_1 = (n-1)\,B_1 .$$

So an unbiased estimate of the bias of $R_1$, to the second degree of
approximation, is

$$\frac{R_n - R_1}{n-1} .$$

This is used to correct $R_1$ for its bias, obtaining

$$R_{11} = \frac{n\,R_1 - R_n}{n-1} .$$
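The correction amounts to subtracting the estimated bias
$(R_n - R_1)/(n-1)$ from $R_1$. A minimal sketch follows, with a
hypothetical function name and data:

```python
def murthy_nanjamma(xs, ys):
    """Almost unbiased ratio estimate (n * R1 - Rn) / (n - 1)."""
    n = len(xs)
    r1 = sum(ys) / sum(xs)                          # ratio of sample means
    rn = sum(y / x for x, y in zip(xs, ys)) / n     # mean of the ratios
    estimated_bias = (rn - r1) / (n - 1)
    return r1 - estimated_bias


# Hypothetical sample of three (x, y) pairs.
r11 = murthy_nanjamma([1, 2, 4], [2, 3, 9])
```

Subtracting the estimated bias from $R_1$ and collecting terms gives the
closed form $(nR_1 - R_n)/(n-1)$ used in the text.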
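Another estimator, unbiased to the third degree of approximation, is

$$R_{11A} = \frac{2n\,R_1}{n-1} - \frac{n\,R_2}{n-2}
          + \frac{2\,R_n}{(n-1)(n-2)},$$

where the sample was split into two parts and, in this formula,

$$R_2 = \tfrac{1}{2}\left(\bar{y}_1/\bar{x}_1 + \bar{y}_2/\bar{x}_2\right),$$

$$R_1 = \bar{y}/\bar{x} ,$$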


$$R_n = \frac{1}{n}\sum_{i=1}^{n} y_i/x_i .$$

Unbiased Ratio-Type Estimator

The fact that the standard ratio estimators are biased has led to the
exploration and development of unbiased ratio-type estimators. These
estimators, while having the desirable properties of a ratio estimator,
are unbiased. Research in this field can be classified into two broad
categories. The first, the development of an unbiased estimator through
the use of commonly used sampling schemes, has been explored by Hartley and
Ross (1954), Robson (1957), Goodman and Hartley (1958), Mickey (1959),
Robson and Vithayasai (1961), and Williams (1961), among others. The second
class of development was concerned with developing and modifying certain
sampling schemes so that, under these schemes, the usual ratio estimator
becomes unbiased. Major contributions here have been Lahiri (1951),
Midzuno (1952), Horvitz and Thompson (1952), Raj (1954), Mickey (1959),
Nanjamma, Murthy, and Sethi (1960), Williams (1961), and Pathak (1964).
Both of these classes will be reviewed in this report, with some
comparisons between these and the previously mentioned reduced-bias
estimators.

4.   THE UNBIASED ESTIMATOR (COMMONLY USED SAMPLING SCHEMES)

4.1.  Hartley and Ross's Estimator
The first developments in unbiased ratio-type estimators employing
commonly used sampling schemes were by Hartley and Ross (12) in 1954. In
brief, they considered

$$R_n = \frac{1}{n}\sum_{i=1}^{n} y_i/x_i ,$$

one of the standard biased estimators, and corrected it for bias by
examining the population covariance of $y/x$ and $x$:

$$\mathrm{Cov}(y/x,\,x) = E\!\left(\frac{y}{x}\cdot x\right)
  - E(y/x)\,E(x),$$

and so

$$E(y/x) = \frac{E(y)}{E(x)} - \frac{\mathrm{Cov}(y/x,\,x)}{E(x)}
         = \frac{\bar{Y}}{\bar{X}}
           - \frac{1}{\bar{X}}\,\mathrm{Cov}(y/x,\,x).$$

Since

$$E(R_n) = E(y/x),$$

the bias in $R_n$ is given by $-\frac{1}{\bar{X}}\,\mathrm{Cov}(y/x,\,x)$,
an exact expression. An unbiased estimate of this covariance, apart from
the finite population factor $(N-1)/N$, is

$$\frac{1}{n-1}\sum_{i=1}^{n}(r_i - \bar{r})(x_i - \bar{x})
  = \frac{n}{n-1}\,(\bar{y} - \bar{r}\,\bar{x}),$$

where

$$r_i = y_i/x_i , \qquad \bar{r} = R_n .$$

$R_n$ corrected for bias becomes

$$R_{12} = \bar{r} + \frac{n(N-1)}{(n-1)\,N\,\bar{X}}\,
           (\bar{y} - \bar{r}\,\bar{x}).$$
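The unbiasedness of the Hartley and Ross estimator under simple random
sampling without replacement can be checked by brute force on a small
population. The sketch below is illustrative only: the function names and
the five-unit population are hypothetical, and the check simply averages
the estimator over every possible sample of size $n$.

```python
from itertools import combinations


def hartley_ross(xs, ys, N, pop_mean_x):
    """rbar + n (N - 1) / ((n - 1) N Xbar) * (ybar - rbar * xbar)."""
    n = len(xs)
    r_bar = sum(y / x for x, y in zip(xs, ys)) / n
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    factor = n * (N - 1) / ((n - 1) * N * pop_mean_x)
    return r_bar + factor * (y_bar - r_bar * x_bar)


def mean_over_all_samples(pop_x, pop_y, n):
    """Average the estimator over all samples of size n (equal weights)."""
    N = len(pop_x)
    pop_mean_x = sum(pop_x) / N
    estimates = [
        hartley_ross(
            [pop_x[i] for i in idx], [pop_y[i] for i in idx], N, pop_mean_x
        )
        for idx in combinations(range(N), n)
    ]
    return sum(estimates) / len(estimates)


# Hypothetical population with ratio of totals 40 / 20 = 2.
population_x = [1, 2, 4, 5, 8]
population_y = [3, 3, 9, 12, 13]
expected_value = mean_over_all_samples(population_x, population_y, n=3)
```

Because every sample is equally likely, the average over all samples is
the expected value of the estimator, which should equal the population
ratio up to rounding error.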


Hartley and Ross (12) gave an approximate variance, for large samples,
as

$$V(R_{12}) = \frac{1}{n\bar{X}^2}\,
  \bigl(V(y) + R^2\,V(x) - 2R\,C(x,y)\bigr),$$

where

V(y)    =  population variance of y,
V(x)    =  population variance of x,
C(x,y)  =  population covariance of x and y,
R       =  $\bar{Y}/\bar{X}$.

They state that this is also the approximate variance of $R_1$ if terms up
to and including the quadratic are considered. Therefore they conclude
that, while the bias is eliminated, the variance has not increased to any
degree. They also state that similar results for bias elimination in $R_1$
may also be applied. If this is done, we obtain

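$$R_{13} = R_1 + \frac{1}{\bar{X}}\,C(R_1, \bar{x}),$$

where $C(R_1, \bar{x})$ is the covariance of $R_1$ and $\bar{x}$.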

An exact formula for the variance of $R_{12}$ is given for any size
sample by Goodman and Hartley (7), if the finite population correction may
be omitted, as

$$V(R_{12}) = \frac{1}{n\bar{X}^2}\left(V(y) + \bar{R}^2\,V(x)
  - 2\bar{R}\,C(x,y)
  + \frac{1}{n-1}\bigl(V(r)\,V(x) + C(r,x)^2\bigr)\right),$$

where

$\bar{R}$ is the population mean of the $r_i$'s,

$V(r)$ is the population variance of the $r_i$'s,

$C(r,x)$ is the population covariance of $r_i$ and $x_i$.

An exact formula obtained through the use of multivariate polykays was
given by Robson (35).

Goodman and Hartley develop an extremely cumbersome formula for an
unbiased estimate of the population variance (see Goodman and Hartley (7)).
In the same paper they developed a much simpler estimate of the population
variance, also unbiased but with larger sampling error, by modifying the
sampling scheme. The procedure is as follows. First draw a random sample
of $m$ pairs $(x_i, y_i)$ without replacement, then replace the sample and
draw another sample of $m$ pairs. This method makes the two samples
independent, whereas the random splitting of a sample of size $n = 2m$
into two halves will not. If the two samples are identical, reject the
second and draw another. If $n \ll N$, the two samples will usually have
no elements in common. The estimator

$$\hat{\bar{Y}}_{14} = \tfrac{1}{2}\,\bigl(\hat{\bar{Y}}_{14,1}
                       + \hat{\bar{Y}}_{14,2}\bigr)$$

is an unbiased estimator of $\bar{Y}$, and an unbiased estimate of the
variance of $\hat{\bar{Y}}_{14}$, apart from a finite population
correction, is

$$s^2(\hat{\bar{Y}}_{14}) = \tfrac{1}{4}\,\bigl(\hat{\bar{Y}}_{14,1}
                            - \hat{\bar{Y}}_{14,2}\bigr)^2 ,$$

where

$$\hat{\bar{Y}}_{14,i} = \bar{r}_i\,\bar{X}
  + \frac{m(N-1)}{(m-1)N}\,(\bar{y}_i - \bar{r}_i\,\bar{x}_i)$$

and $\bar{x}_i$, $\bar{y}_i$, $\bar{r}_i$ are the sample means from the
$i$-th sample. The unbiased estimate of the variance is based on only one
degree of freedom; if more degrees of freedom are desired, $k$ samples of
size $n/k$ could be drawn. In stratified sampling the disadvantage of the
one degree of freedom is eliminated to a certain degree. The following
example illustrates the use of this method. In this example $N = 400$ and
$\bar{X} = 2$. Two samples of size 2 were drawn.


            1st Sample                     2nd Sample

          x       y       r              x       y       r

          1       3       3              4       8       2
          2       6       3              1       2       2

Means     1.5     4.5     3              2.5     5       2

In an example like this, the finite population corrections, such as
$(N-1)/N$, can usually be replaced by 1.

$$\hat{\bar{Y}}_{14,1} = 2(3) + 2\,(4.5 - 3(1.5)) = 6$$

$$\hat{\bar{Y}}_{14,2} = 2(2) + 2\,(5 - 2(2.5)) = 4$$

$$\hat{\bar{Y}}_{14} = \frac{6 + 4}{2} = 5$$

$$s^2(\hat{\bar{Y}}_{14}) = \tfrac{1}{4}\,(6 - 4)^2 = 1$$

This example was due to Goodman and Hartley (7).
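The figures in this example can be reproduced with a few lines of code.
This is a sketch with a hypothetical function name; as in the example, the
finite population factor $(N-1)/N$ is taken to be 1.

```python
def hartley_ross_mean(xs, ys, pop_mean_x):
    """rbar * Xbar + m / (m - 1) * (ybar - rbar * xbar), with (N-1)/N = 1."""
    m = len(xs)
    r_bar = sum(y / x for x, y in zip(xs, ys)) / m
    x_bar = sum(xs) / m
    y_bar = sum(ys) / m
    return r_bar * pop_mean_x + m / (m - 1) * (y_bar - r_bar * x_bar)


# The two independent samples of the example, with population mean of x = 2.
first = hartley_ross_mean([1, 2], [3, 6], pop_mean_x=2)
second = hartley_ross_mean([4, 1], [8, 2], pop_mean_x=2)

combined = (first + second) / 2                # estimate of the mean
variance_estimate = (first - second) ** 2 / 4  # one degree of freedom
```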

Goodman and Hartley state that in large samples, where the approximate
formula for $V(R_1)$ is applicable, $V(R_1)$ will be smaller than
$V(R_{12})$ in most cases. Raj (30) showed that present comparisons are
not valid for small samples, since the approximate variance formula
definitely understates the true variance. If $x$ is symmetrically
distributed, the understatement as a proportion of the approximate
variance exceeds three times the relative variance of $\bar{x}$, with a
higher underestimation if the distribution of $x$ is negatively skewed.

Goodman and Hartley point out a special case where the variance of
the unbiased estimator is always smaller than that of the usual one. This
is when the conditional variance of $r$ given $x$ decreases with $x$,
i.e., the array variance of $r$ decreases with increasing $x$ in the
scatter diagram $(x, r)$. For this kind of data, the unbiased ratio
estimator proposed by Hartley and Ross (12) is better than the simple
ratio method.

Olkin (23) extended Hartley and Ross's estimator to the case where
multiple auxiliary variables are used to increase precision. Considering
the case of $p$ such auxiliary variables $x_1, x_2, \ldots, x_p$, Olkin
developed the estimator

$$\hat{\bar{Y}}_{15} = \sum_{i=1}^{p} w_i\,\bar{r}_i\,\bar{X}_i
  + \frac{(N-1)\,n}{N(n-1)}
    \left(\bar{y} - \sum_{i=1}^{p} w_i\,\bar{r}_i\,\bar{x}_i\right),$$

an unbiased estimator of $\bar{Y}$, where

$$\bar{r}_i = \frac{1}{n}\sum_{j=1}^{n} y_j/x_{ij}$$

and the $w_i$ are chosen to minimize the variance of the estimator. Common
choices of $w_i$ would be $1/x_i$ if the variance of $y$ increases with
the square of $x$, or 1 if the variance of $y$ appears to increase
linearly with $x$. For a full discussion of optimum choices of weights,
see Raj (31).

k.2     Robson's  Estimator 
Robson  (35),  in  1957,  applied  the  results  of  multivariate  polykays 
to  obtain  the  previously  mentioned  exact  variance  formula  for  Heirtley  and 
Ross's  unbiased  estimator.  He  also   obtained  Hartley  and  Ross's  estimator 


2k 


by  using  multivariate  polykays.  For  a  discussion  of  this  see  Robson  (35). 
In  this  same  paper,  Robson  adjusted  cinother  standard  biased  ratio  estimator 

^^  which  has  greater  precision  than  R,   or  R   if  the  correlation  is 
X 

negative  between  x  and  y,  to  obtain  a  corresponding  unbiased  estimator. 

The  bias  of  ^^-^  ,  an  estimate  of  7  is 
X 

E  (^  -  Y)  =  ^  (e(x  y)  -  X  Y)    . 
X        X 

=  -  Cov  (x,  y) 
X 

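Therefore, an adjusted unbiased estimator of the ratio is

$$R_{16} = \frac{\bar x \bar y}{\bar X^2} - \frac{N-n}{N n (n-1) \bar X^2} \sum_{i=1}^{n} (x_i - \bar x) \, y_i$$

or

$$R_{16} = \frac{1}{\bar X^2} \Big[ \frac{n(N-1)}{N(n-1)} \, \bar x \bar y - \frac{N-n}{N(n-1)} \cdot \frac{\sum_{i=1}^{n} x_i y_i}{n} \Big] .$$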
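
As a check on the adjustment, the unbiasedness of $R_{16}$ under simple random sampling without replacement can be verified exactly on a small population by listing every sample. The sketch below is illustrative only: the function name `robson_r16` is hypothetical, and the six (y, x) pairs are those of Population 1 from the computer study later in this report, used here purely as test data.

```python
from fractions import Fraction
from itertools import combinations


def robson_r16(sample, N, X_bar):
    """Bias-adjusted product-type estimate of R = Y-bar / X-bar from (y, x) pairs."""
    n = len(sample)
    y_bar = Fraction(sum(y for y, _ in sample), n)
    x_bar = Fraction(sum(x for _, x in sample), n)
    cross = sum((x - x_bar) * y for y, x in sample)      # sum (x_i - x_bar) y_i
    correction = Fraction(N - n, N * n * (n - 1)) * cross
    return (x_bar * y_bar - correction) / X_bar ** 2


population = [(0, 2), (1, 3), (2, 5), (4, 9), (8, 14), (9, 15)]  # (y, x) pairs
N = len(population)
X_bar = Fraction(sum(x for _, x in population), N)
Y_bar = Fraction(sum(y for y, _ in population), N)

# Average the estimator over all equally likely samples of size 4.
estimates = [robson_r16(s, N, X_bar) for s in combinations(population, 4)]
mean_estimate = sum(estimates) / len(estimates)
print(mean_estimate, Y_bar / X_bar)
```

Exact rational arithmetic is used so that the comparison with $\bar Y / \bar X$ is not blurred by rounding.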

Again using multivariate polykays, Robson (35) found for an unbiased estimate
of the variance of $R_{16}$ as the sample size becomes large,

2 

s2  (R  ^  =  _iL   S^(y)  +  S^(x)  +  2  ^<^»y)  +  -^  (S^(x)  S^(y)  *   fs(x,y)l 

^  l6'     ^2-2      -2       -  -     n-1  ^       -2  -2 
nXy      X        xy  xy 

This was obtained by substituting the above sample estimates for population
values in the population variance.

4.3.  Mickey's Estimator

Mickey (19) developed a method for producing a broad class of unbiased
ratio-type estimators, by using the fact that $\bar y - a(\bar x - \bar X)$ is an unbiased
estimator of $\bar Y$ for any choice of a.  He also used the fact that for any
choice of m of the n sampling units, the n-m remaining units can be considered
a random sample of n-m from the N-m units derived by omitting the
m given units.  Mickey then chooses a as a function of the m selected
units and uses $\bar y - a(\bar x - \bar X)$ to get an unbiased estimate of the population
of N-m units which leads to an unbiased estimate for the whole population
by utilizing the relationship between the two populations determined by
m, N, and the m selected units.  Since y is a biased estimator, $a(\bar x - \bar X)$
is an estimate of the bias obtained by applying the form of the biased estimator
to the subsample in estimating the sample mean, y.  Mickey uses the
following formula to generate his estimators.

$$R_m = a(Z_m) \bar X + \frac{N-m}{N(n-m)} \big( Y(n) - a(Z_m) X(n) \big) - \frac{N-n}{N(n-m)} \big( Y(m) - a(Z_m) X(m) \big)$$

where

$Z_m$ is the ordered set of observations on the first m sample elements,
$1 \le m < n$, $a(Z_m)$ is a function of these observations to be determined, X(m),
Y(m) are the sums of the first m sample elements, and X(n), Y(n) are the
sample totals.  Particular estimators are generated by the choice of $a(Z_m)$,
and a general class of estimators is constructed by including all estimators
of the form above applied to any permutation of the ordering of the sample,
weighted averages of such estimators, and estimators obtained from subsamples
of the given sample.  A knowledge of the population one is sampling from
helps in choosing functions.  When the variance in y increases as the
square of x, Mickey's techniques lead to the estimator $R_{12}$, Hartley and
Ross's estimator.  When the variance increases linearly with x, Mickey's
estimator is

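$$R_{17} = \frac{\bar y(m)}{\bar x(m)} + \frac{(N-m)n}{N(n-m) \bar X} \Big( \bar y - \frac{\bar y(m)}{\bar x(m)} \, \bar x \Big)$$

where

$\bar y(m)$, $\bar x(m)$ are sample means of the first m observations.  For
m = n-1, $R_{17}$ becomes

$$R_{18} = r_{n-1} + \frac{(N-n+1)n}{N \bar X} \big( \bar y - r_{n-1} \bar x \big)$$

where

$$r_{n-1} = \frac{1}{n} \sum_{j=1}^{n} \frac{n \bar y - y_j}{n \bar x - x_j} .$$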
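
The two estimators above can be checked for unbiasedness by exact enumeration on a small population. The following sketch is illustrative only: the function names `mickey_r17` and `mickey_r18` are hypothetical, and the six (y, x) pairs of Population 1 from the later computer study are borrowed as test data. For $R_{17}$ the order of draw matters, so the average is taken over all ordered samples; $R_{18}$ is symmetric in the sample, so unordered samples suffice.

```python
from fractions import Fraction
from itertools import combinations, permutations


def mickey_r17(ordered, m, N, X_bar):
    """R_17 for an ordered sample of (y, x) pairs, with a = Y(m) / X(m)."""
    n = len(ordered)
    a = Fraction(sum(y for y, _ in ordered[:m]), sum(x for _, x in ordered[:m]))
    y_bar = Fraction(sum(y for y, _ in ordered), n)
    x_bar = Fraction(sum(x for _, x in ordered), n)
    return a + Fraction((N - m) * n, N * (n - m)) / X_bar * (y_bar - a * x_bar)


def mickey_r18(sample, N, X_bar):
    """R_18: the m = n - 1 case, averaged over which unit is left out."""
    n = len(sample)
    ty = sum(y for y, _ in sample)
    tx = sum(x for _, x in sample)
    r = sum(Fraction(ty - y, tx - x) for y, x in sample) / n   # r_{n-1}
    return r + Fraction((N - n + 1) * n, N) / X_bar * (Fraction(ty, n) - r * Fraction(tx, n))


population = [(0, 2), (1, 3), (2, 5), (4, 9), (8, 14), (9, 15)]  # (y, x) pairs
N = len(population)
X_bar = Fraction(sum(x for _, x in population), N)
R = Fraction(sum(y for y, _ in population), sum(x for _, x in population))

ordered_samples = list(permutations(population, 4))
mean_r17 = sum(mickey_r17(s, 2, N, X_bar) for s in ordered_samples) / len(ordered_samples)

samples = list(combinations(population, 4))
mean_r18 = sum(mickey_r18(s, N, X_bar) for s in samples) / len(samples)
print(mean_r17, mean_r18, R)
```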

Mickey goes on to develop another estimator for which he also develops
an easy formula to estimate its variance.  Let R(m,n) denote an estimator $R_m$
based on a sample of size n.  Suppose also there are k+1 integers
$0 < m_1 < \dots < m_{k+1} = n$, and consider the k estimators
$R(m_1, m_2), R(m_2, m_3), \dots, R(m_k, n)$.  The estimator Mickey developed was

$$R_{19} = \frac{1}{k} \sum_{j=1}^{k} R(m_j, m_{j+1}) .$$

He states an unbiased, non-negative estimator of the variance is

$$s^2(R_{19}) = \frac{1}{k(k-1)} \sum_{j=1}^{k} \big( R(m_j, m_{j+1}) - R_{19} \big)^2 .$$


There is a great deal of flexibility since the $R(m_j, m_{j+1})$ may be chosen as
Hartley and Ross's estimator, $R_{17}$, $R_{18}$, or other similar estimators.  The
precision of $R_{19}$ could be improved by averaging with respect to a random
sample or all possible orderings of the sample elements.  To clarify the
previous discussion two examples will be considered.

Example 1.  The first example involves a table constructed by Cochran (2, table
6.1).  He gives values of x and y for 49 cities, where y is the
number of inhabitants of a city in 1930 and x is the corresponding number
for 1920.  The unit of count is 1000 individuals.  A random sample of
size 5 was selected and $R_{19}$ was calculated using

$$R(m_j, m_{j+1}) = R(i-1, i) = R(i-1) + \frac{N-i+1}{N \bar X} \big( Y(i) - R(i-1) X(i) \big)$$

and

$$R_{19} = \frac{1}{k} \sum_{i=2}^{k+1} R(i-1, i)$$

where

$$X(i) = \sum_{j=1}^{i} x_j , \qquad Y(i) = \sum_{j=1}^{i} y_j , \qquad R(i) = Y(i)/X(i) .$$



The five elements sampled, listed as (y, x) in the order drawn, were:  (63,37), (58,50),
(80,76), (53,45), and (113,121).

Table 2.   Illustration of Computations for Estimator $R_{19}$

 i    Y(i)    X(i)    R(i-1)    Y(i)-R(i-1)X(i)    (N-i+1)/N    R(i-1,i)

 1      63      37
 2     121      87    1.7027        -27.135           .9796       1.4498
 3     201     163    1.3908        -25.700           .9592       1.1518
 4     254     208    1.2331         -2.485           .9388       1.2105
 5     367     329    1.2212        -34.775           .9184        .9116


$$R_{19} = \frac{1.4498 + 1.1518 + 1.2105 + .9116}{4} = 1.1809$$

$$s^2(R_{19}) = \frac{(1.4498)^2 + \dots + (.9116)^2 - 4(1.1809)^2}{4(3)} = .0119923$$

$$s(R_{19}) = .1095$$
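
The chain of computations in Example 1 can be reproduced with a short script. This is a sketch under stated assumptions: the helper name `mickey_chain` is hypothetical, and $\bar X$ is assumed to be 5054/49, the 1920 total for the 49 cities that appears as X(49) in Table 3. Under that assumption the recomputed links for i = 3, 4, 5 agree with Table 2 to about three decimals, while the first link comes out near 1.445 rather than the tabulated 1.4498.

```python
pairs = [(63, 37), (58, 50), (80, 76), (53, 45), (113, 121)]  # (y, x), order drawn
N = 49
X_bar = 5054 / 49   # assumption: population mean of x taken from the Table 3 total


def mickey_chain(pairs, N, X_bar):
    """Return R(i-1, i) for i = 2..n, built from cumulative totals Y(i), X(i)."""
    estimates = []
    Y = X = 0
    for i, (y, x) in enumerate(pairs, start=1):
        if i > 1:
            r_prev = Y / X                          # R(i-1) = Y(i-1) / X(i-1)
            resid = (Y + y) - r_prev * (X + x)      # Y(i) - R(i-1) X(i)
            estimates.append(r_prev + (N - i + 1) / (N * X_bar) * resid)
        Y += y
        X += x
    return estimates


chain = mickey_chain(pairs, N, X_bar)
k = len(chain)
r19 = sum(chain) / k                                        # average of the k links
s2 = sum((r - r19) ** 2 for r in chain) / (k * (k - 1))     # variance estimate
print([round(r, 4) for r in chain], round(r19, 4), round(s2, 6))
```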


Example 2.  This time the population is the entire 196 cities considered by
Cochran and the sample is the 49 cities listed.  Computation can be lessened
by using $m_1$ equal to some number larger than 1.  Choosing k=4 as
in the previous example, let $m_1=5$, $m_2=19$, $m_3=31$, $m_4=41$, $m_5=49$.  These
are strictly arbitrary.

Table 3.   Illustration of Computations for Estimator $R_{19}$

 i   m_i   Y(m_i)   X(m_i)   R(m_{i-1})   Y(m_i)-R(m_{i-1})X(m_i)   (N-m_{i-1})/(N(m_i-m_{i-1}))   R(m_{i-1},m_i)

 1     5      804      691
 2    19     3103     2522    1.163531          168.574818                    .096606                 1.49478
 3    31     4134     3368    1.2303736          10.103736                    .075255                 1.23687
 4    41     5334     4306    1.233373           23.095862                    .084184                 1.25000
 5    49     6262     5054    1.238737            1.423302                    .098852                 1.23994


$$R_{19} = 1.305397$$

$$s^2(R_{19}) = \frac{(1.49478)^2 + \dots + (1.23994)^2 - 4(1.305397)^2}{4(3)} = .136881$$


4.4.  Robson and Vithayasai's Estimator

Robson and Vithayasai (36) develop a more efficient estimator for certain
types of populations by using Hartley and Ross's correction for bias.  The type
of population under consideration was one in which x and y could be expressed as
the sum of k corresponding components, and in which the components were more
highly correlated than x and y.  In this case a componentwise ratio estimator
such as

k 


j=i 


J'  J 


is generally more efficient, although it is biased.  By using Hartley and
Ross's estimator, Robson and Vithayasai obtained an unbiased componentwise
ratio-type estimator


J=l  ^        (nj  i)Xj   -^    ^     ^

where

$\bar r_j$, $\bar x_j$, $\bar y_j$ are the means of the k components,

$N_j$ is the population size of the j-th component,

$\bar X_j$ is the population mean of the j-th component.

An example of its use from general sample survey theory is the case of cluster
sampling with post stratification, x representing the number of elements in a
cluster and y the cluster total for some measured character.  If the
x elements in a randomly chosen cluster are partitioned into k strata of
size $x_j$, with $\bar X_j$ known, then the above estimator may be much more efficient than
the non-stratified estimator.

4.5.  Williams' Estimators

Williams (39) considered the generation of some unbiased ratio and
regression estimators, differentiating between the two as follows.  He
classified an estimator as a regression type if it was invariant under
location and scale changes in x and if it underwent the same location and
scale changes in y.  He classified an estimator as a ratio type if the above
properties hold for scale changes only.

The following procedure was considered by Williams.  First he selected
with equal probability one of all possible splits of the population into s
groups of size n/k, N = sn/k.  Second he selected at random without replacement
k of the groups from the s groups of that split, yielding a sample
of size n.  Williams considered the conditional distribution for a particular
set of s groups, eventually deriving the unconditionally unbiased estimate
of R


"21 


A  1=1

where

$\bar x_i$ is the mean of the n/k units in the i-th group,

$b_i$ is an as yet unspecified function of the y and x of the i-th
group, chosen to make $R_{21}$ a ratio estimator, and

$$\bar b = \sum_{i=1}^{k} b_i / k .$$

This approach ensures that $R_{21}$ is an unbiased estimator for any choice of the
$b_i$.

In practice a sample of size n is taken and split randomly into groups.
Williams states that this also preserves the unbiasedness of the estimator.
For

$$b_i = \sum_{j=1}^{n/k} x_{ij} y_{ij} \Big/ \sum_{j=1}^{n/k} x_{ij}^2 ,$$

Williams gets


R22  =  - 


n/k 
,    ,         ,     k     n/k  n/k       _  „         n  k     //iJ^iJ     ^/^ 


J=l 


n/k 

^  iii  V   2  ^  • 


1=1  W^ 


J=l     ^J 


For

$$b_i = \bar y_i / \bar x_i = r_i ,$$

$R_{21}$ becomes


„     rx  .  1  N(k-n)   ,  — v 


X   X 
When k=n, $R_{23}$ is identical to Hartley and Ross's unbiased ratio estimator.

For

$$b_i = \bar r_i = \frac{k}{n} \sum_{j=1}^{n/k} r_{ij} , \qquad r_{ij} = y_{ij} / x_{ij} , \qquad \bar b = \bar r = \frac{1}{k} \sum_{i=1}^{k} \bar r_i ,$$

Williams again gets Hartley and Ross's estimator upon substitution into $R_{21}$,
when averaged over all possible splits of the sample into groups of size n/k.
For clarification of Williams' estimator a simple example follows.

A simple random sample of four pairs $(y_i, x_i)$ was drawn from a population
of size 100 with $\bar X$ = 2.0.  The sample was split randomly into 2 groups:
(2,1) and (3,2) in the first group, (1,1) and (4,2) in the second.  For $R_{22}$,
we have the following

^1-1^-^-^ 


^2  -1^=^-^ 


b>  - ' 


h  = ' 
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^22  "  |U2.5+|(l.6+1.8)(2-1.5))+3|f  •  |((1.6)3+1.8(3))-^1.6+1.8)} 
»|{(3.35)  +3^(10.2)  -  (2.5)(1.7)} 

=  |(3.996)  =  1.998  . 

5.   THE  UNBIASED  ESTIMATOR  (MODIFICATION  OF  SAMPLING  SCHEMES) 

This section will be concerned with a presentation of various sampling
schemes and modifications of sampling schemes to make the ordinary simple
ratio estimators unbiased.  Theoretical results will be minimized to clarify
the actual methods in the following section.

5.1.  Lahiri's Methods

Lahiri (18), in 1951, showed that if a sample was drawn with probability
proportional to the sum of the x elements in the sample, the ordinary ratio
estimate $\bar y / \bar x$ was unbiased.  An exact result would involve forming cumulative
totals for all possible samples of size n, an almost impossible task in
most cases.  Lahiri then developed some procedures which, while yielding an
unbiased estimator, greatly reduced the amount of
work in sampling.  The first was drawing a sample of size n, unit by unit,
when the largest x value is known.  This involves sampling proportional
to the x values.  To select the first unit in the sample, choose a random
value between 0 and $x_{\max}$, the largest value.  Now choose at random one of
the units in the population.  If its x value is greater than or equal to the random
value chosen, retain it; if not, reject it.  In either case a new random
value is chosen, and a new unit is chosen from the population each time
until a sample of the desired size is chosen.  This process results in a
sample of size n drawn with probability proportional to the x's.  The unbiased estimator is

$$R_{24} = R_n = \frac{1}{n} \sum_{i=1}^{n} y_i / x_i .$$


The variance of this estimator under this sampling scheme was given by
Raj (27) to be

and estimated by

$$s^2(R_{24}) = \frac{1}{n(n-1)} \sum_{i=1}^{n} (y_i / x_i - R_{24})^2 .$$
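
The unit-by-unit rejection procedure and the estimator $R_{24}$ can be sketched as follows. The function names `lahiri_draw` and `lahiri_sample` are hypothetical, Python's `random.Random` stands in for the table of random numbers, and the extra guard against accepting a unit with x = 0 is an assumption added for the sketch, not part of the description above.

```python
import random


def lahiri_draw(xs, rng):
    """Draw one unit index with probability proportional to xs, by rejection."""
    x_max = max(xs)
    while True:
        i = rng.randrange(len(xs))      # candidate unit, equal probability
        v = rng.uniform(0, x_max)       # random value between 0 and x_max
        if xs[i] >= v and xs[i] > 0:    # retain the unit if its x covers v
            return i


def lahiri_sample(ys, xs, n, rng):
    """Draw n units one by one and return R_24 with its variance estimate."""
    idx = [lahiri_draw(xs, rng) for _ in range(n)]
    ratios = [ys[i] / xs[i] for i in idx]
    r24 = sum(ratios) / n
    s2 = sum((r - r24) ** 2 for r in ratios) / (n * (n - 1))
    return r24, s2


# Illustrative population: y and x values of six units (sum of x is 48).
ys = [0, 1, 2, 4, 8, 9]
xs = [2, 3, 5, 9, 14, 15]
rng = random.Random(12345)
print(lahiri_sample(ys, xs, 4, rng))
```

A candidate with value x is retained with probability x / x_max, so the retained unit is distributed proportionally to x; the price is the number of rejected candidates, which grows when a few units are much larger than the rest.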


This sampling procedure can involve many rejections, which may be costly.
To reduce the number of rejections, Lahiri considered several alternative
schemes.  The first involved using some large unit x max.  Now a unit is
chosen, say $x_i$.  If $x_i$ is larger than x max, keep it and look at
$x_i$/x max = Q+R where Q is an integer.  The unit is listed 1+Q times, the
first of size R, the rest of size x max.  An alternative device may be
used if there are a small number of extraordinarily large sizes, and it consists
of dividing the population into two groups, one made up of the large
units, the second, the remaining units.  A set of three random numbers is
utilized which:

(1)  decides which group the selection is to be made from,

(2)  fixes the unit which is to be accepted or rejected on the basis
of three,

(3)  chooses the random value between 0 and x max.

The second type of procedure Lahiri employed was to choose the entire
sample with probability proportional to the sum of the observations of
x in the sample, $\sum x_i$.  His practical method was to:

(1)  choose a set of n elements at random (with or without replacement)
and find $\sum x_i$,

(2)  choose a random value between 0 and $\sum x_i$, say V,

(3)  now choose another sample and if $\sum x_i$ for this sample is greater
than or equal to V, keep it.  If $\sum x_i$ is less than V, replace it
and begin the process anew.  Find another random number V and
draw another sample, until the sample satisfies the criterion.

The estimator used by Lahiri in this case was

$$R_{25} = R_1 = \bar y / \bar x .$$
Raj (27), in his investigation of Lahiri's procedure, derived the
variance of $R_{25}$ as


J 


•25'    -   ^         '  -« 

I    -  I    .1      I  /A.  I 


where $\sum'$ denotes summation over all possible samples; $(\sum y_i)_j$, $(\sum x_i)_j$ are
totals of the j-th sample.  He also obtained an unbiased estimate of the
variance as


n 

2,      ,        2    UJ   ^^i      J^^=^ 

s^«25^  =  «25 - t:^  Tifer*^— 7nI2 — 


i^     G)        CD

5.2.  Midzuno's Method

Midzuno (20) and Sen have independently given a simple procedure for
obtaining a sample with probability proportional to size, thereby making
the simple ratio estimate $\bar y / \bar x$ unbiased.  Their method involved the following
procedure:

(1)  Select the first unit in the sample with probability proportional
to size as follows:  Choose a random number between 0 and the largest x value,
now choose a random x value.  If it is greater than or equal to the random
number, keep it; otherwise, start the procedure again.

(2)  Select the rest of the sample with equal probability without
replacement from the remaining units of the population.

The following proof, showing that $\bar y / \bar x$
is unbiased for this procedure, is due to Cochran (2).

The probability that a sample of size n with a fixed value of $\sum^{n} x_i$ is
drawn is

$$P = \frac{\sum^{n} x_i}{\binom{N-1}{n-1} \sum^{N} x_i} ,$$

since the total of $\sum^{n} x_i$ added over all simple random samples of size n is

$$\binom{N-1}{n-1} \sum^{N} x_i .$$

For the estimator $\bar y / \bar x = \sum^{n} y_i / \sum^{n} x_i$,

$$E(\bar y / \bar x) = \sum_{\text{all } s} P \cdot \frac{\sum^{n} y_i}{\sum^{n} x_i}$$

where $\sum_{\text{all } s}$ represents a summing over all possible simple random samples, so that

$$E(\bar y / \bar x) = \sum_{\text{all } s} \frac{\sum^{n} y_i}{\binom{N-1}{n-1} \sum^{N} x_i} = \frac{\sum^{N} y_i}{\sum^{N} x_i} = R ,$$

showing $\bar y / \bar x$ is unbiased for this method of selection.
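
The argument can be confirmed numerically by listing every sample of a small population and weighting each by the selection probability P. The sketch below is illustrative only; it borrows the six (y, x) pairs of Population 1 from the later computer study, with n = 4, and uses exact fractions so that the equality is not obscured by rounding.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

population = [(0, 2), (1, 3), (2, 5), (4, 9), (8, 14), (9, 15)]  # (y, x) pairs
N, n = len(population), 4
X_total = sum(x for _, x in population)
Y_total = sum(y for y, _ in population)

expectation = Fraction(0)
total_probability = Fraction(0)
for s in combinations(population, n):
    sy = sum(y for y, _ in s)
    sx = sum(x for _, x in s)
    p = Fraction(sx, comb(N - 1, n - 1) * X_total)   # P(s), proportional to sum of x
    total_probability += p
    expectation += p * Fraction(sy, sx)              # estimator y-bar / x-bar

print(total_probability, expectation, Fraction(Y_total, X_total))
# prints: 1 1/2 1/2
```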

An unbiased estimate of the variance of $R_{26}$ was given by Nanjamma,
Murthey, and Sethi to be


hl^^Bi.^ 


Nn  xX 


i^J 


They also state that the efficiency of the unbiased estimate will be greater
than, equal to, or less than that of the usual estimate according as the
correlation coefficient of $(\bar y^2 / \bar x, \bar x)$ is $\lesseqgtr 0$.

5.3.  Nanjamma,  Murthey,  and  Sethi's  Methods 

Nanjamma, Murthey, and Sethi (22), in 1960, modified many of the selection
procedures commonly used (equal probability sampling, varying probability
sampling, stratified sampling, and multi-stage sampling) to make the usual
simple ratio estimator unbiased.  The procedure is similar to other methods
considered previously, that is, selecting one unit with probability proportional
to size of the correlated x-variable and the remaining units according
to the original scheme of sampling.  Variance estimators were given
by Nanjamma, Murthey, and Sethi for some of the more important sampling schemes.

UNSTRATIFIED SAMPLING WITH EQUAL PROBABILITY AND WITH REPLACEMENT

(1)  Select one unit with probability proportional to size of the
x variate, using Lahiri's (18) or Midzuno's (20) method.

(2)  Select the rest of the sample with equal probability with
replacement.

Then

$$R_{27} = R_1 = \bar y / \bar x$$

is an unbiased estimate of $R = \bar Y / \bar X$.  The probability of getting a particular
sample was shown by Nanjamma, Murthey, and Sethi to be


P(S)  =  -^  '  ^ ^ 

n"         I     L^I        X 

where $L_i$ is the number of repetitions of the i-th unit and v is the number
of distinct units in the sample.  The estimated variance of $R_{27}$ was given
to be

S2(R   )  =  r2  .  -2 ii] 

^'     ^'  n(n-l)x3^ 


UNSTRATIFIED  SAMPLING  WITH  EQUAL  PROBABILITY  SYSTEMATICALLY 

Here the authors considered each unit as made up of n sub-units, each sub-unit of the
i-th unit having the size $X_i/n$ where $X_i$ is the total of the i-th unit.  Now
a sub-unit is chosen with probability proportional to size of the x values.
The others are then determined by proceeding to select the remainder of the sample
systematically with the sub-unit selected first as the random start.  The
probability of a particular sample s is

$$P(s) = x / X$$

and an unbiased estimator of the population ratio is

$$R_{28} = \bar y / \bar x .$$

Nanjamma, Murthey, and Sethi state that it is impossible to get an unbiased
estimate of the population variance from a single sample.

VARYING  PROBABILITY  SAMPLING  PROBABILITY  PROPORTIONAL 
TO  SIZE  WITH  REPLACEMENT  SCHEME 

(1)  Select  first  one  unit  with  probability  proportional  to  x  and 
replace  it. 

(2)  Select the rest of the sample with probability proportional to
Z with replacement, where Z is some measure of size under consideration.

An unbiased estimate of R is then given by

$$R_{29} = \frac{\frac{1}{n} \sum_{i=1}^{n} y_i / p_i}{\frac{1}{n} \sum_{i=1}^{n} x_i / p_i}$$

where

$$p_i = z_i \Big/ \sum_{j=1}^{N} z_j .$$

VARYING PROBABILITY SAMPLING PROBABILITY PROPORTIONAL
TO SIZE WITHOUT REPLACEMENT SCHEME

This is in general not a practical scheme since it involves very heavy
computations, but two special cases were considered by Nanjamma, Murthey, and
Sethi.  The first involves a sample of size two, the first element taken with
probability proportional to x, the second with probability proportional to Z.  An
unbiased ratio estimate is given as


R-Q  = 


where 


P^  =  x^/X  , 


p^  =  yi' . 


The  second  involved  the  first  two  steps  above,  and  then  drawing  n-2  other 
elements  with  equal  probability,  thus  obtaining  a  ratio  estimate 


D  y.- 


n   X.     n 

Kn^)  (I  pj) 
i=i  ^  ^i    j=i  ^ 


which is unbiased.

6.   CONCLUDING REMARKS

Since  there  has  been  little  discussion  in  this  report  on  extensions  of 
the  ratio  estimators  considered  to  sampling  schemes  other  than  simple  random 
sampling,  a  brief  list  of  the  more  important  papers  in  certain  areas  of 
sampling  follows.  The  interested  reader  is  referred  to  these  articles. 

In the area of two-stage and multi-stage sampling, unbiased ratio-type
estimators have been investigated by Nanjamma, Murthey, and Sethi (22),
Pathak (25), Raj (27), Raj (28), Raj (29), Sukhatme (37), and Williams (39).
Although many of the estimators can be directly applied to stratified
sampling schemes, for a more extensive discussion of these techniques see
Raj (27), and Williams (39).  For a discussion of unbiased ratio-type
estimation applied to systematic sampling, see Nanjamma, Murthey, and
Sethi (22).  Since this report has been concerned primarily with a single
variate correlated with the variate of interest, the reader is referred to
Olkin (23), Raj (31), and Williams (40) for use of multi-auxiliary information.

Although Tin (38) has made a fairly thorough comparative study of
several of the reduced-bias estimators, there seems to be little available
to the reader interested in a more extensive comparison involving the usual
biased estimators, reduced-bias estimators, and both classes of unbiased
ratio-type estimators.  One of the major reasons is that some of the variance
formulas involved are not known, and some are only large sample approximations.
Exact expressions for variances are usually mathematically cumbersome
and difficult to compare.

The following study involves three small populations (N=6) with samples
of size n=4 taken from each.  All possible samples were taken from each
population, so the bias and variance could be found exactly for each
population.

Table 4.   Computer Study One

Population 1.   (0,2), (1,3), (2,5), (4,9), (8,14), (9,15);  $\bar X$ = 8.0

Estimator              Bias      Variance     M.S.E.

$\bar y$              0.0000      1.1666      1.1666
$R_1$                 0.0627       .1203       .1242
$R_n$                 0.8673       .2571      1.0033
R                     0.0195       .1152       .1156
R                     0.1168       .1046       .1182
$R_{11}$              0.2056       .1235       .1657
$R_{12}$              0.0000       .2328       .2328
$R_{15B}$             0.2166      3.5129      3.5598
$R_{15}$              0.0000      3.4364      3.4364
R                      .3216       .1217       .2251
$R_{17}$, $R_{19}$    0.0000       .0711       .0711
$R_{22}$              0.0000       .0220       .0220
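
The kind of exhaustive enumeration behind these tables can be sketched as follows for Population 1. This is an illustration, not the original computer program: the function names are hypothetical, the pairs are read as (y, x), each estimator is scaled to estimate $\bar Y$, and only the mean per unit estimator, the simple ratio estimator, and Hartley and Ross's estimator are included. Under those assumptions the three rows come out close to the tabulated values (variances near 1.1667, .1203 and .2328); the computed bias of the simple ratio estimator is about -.0627, so the table appears to report magnitudes.

```python
from itertools import combinations

population = [(0, 2), (1, 3), (2, 5), (4, 9), (8, 14), (9, 15)]  # Population 1, (y, x)
N, n = len(population), 4
X_bar = sum(x for _, x in population) / N
Y_bar = sum(y for y, _ in population) / N


def mean_per_unit(s):
    return sum(y for y, _ in s) / len(s)


def simple_ratio(s):
    # X-bar * (y-bar / x-bar), the usual ratio estimate of Y-bar
    return X_bar * sum(y for y, _ in s) / sum(x for _, x in s)


def hartley_ross(s):
    m = len(s)
    r_bar = sum(y / x for y, x in s) / m
    y_bar = sum(y for y, _ in s) / m
    x_bar = sum(x for _, x in s) / m
    return r_bar * X_bar + m * (N - 1) / (N * (m - 1)) * (y_bar - r_bar * x_bar)


def exact_moments(estimator):
    """Bias, variance and M.S.E. over all equally likely samples of size n."""
    values = [estimator(s) for s in combinations(population, n)]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    bias = mean - Y_bar
    return bias, variance, variance + bias ** 2


for name, est in [("mean per unit", mean_per_unit),
                  ("simple ratio", simple_ratio),
                  ("Hartley-Ross", hartley_ross)]:
    print(name, [round(v, 4) for v in exact_moments(est)])
```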


Table 5.   Computer Study Two

Population 2.   (5,1), (4,2), (4,5), (10,8), (12,11), (16,15);  $\bar X$ = 7.0

Estimator        Bias      Variance     M.S.E.

$\bar y$        0.0000      2.0583      2.0583
$R_1$           0.1327       .5452       .5628
$R_n$           4.5755     10.2863     31.2215
R               0.0248       .4488       .4549
R               0.1798       .2186       .2509
$R_{11}$        1.3482       .3274      2.1450
$R_{12}$        0.0000      2.5924      2.5924
$R_{15B}$       0.3047     10.5797     10.6727
$R_{15}$        0.0000     10.2666     10.2666

Table 6.   Computer Study Three

Population 3.   (0,0), (1,1), (4,2), (9,3), (16,4), (25,5);  $\bar X$ = 2.5

Estimator        Bias      Variance     M.S.E.

$\bar y$        0.0000      7.9139      7.9139
$R_1$           0.1893      1.6185      1.6543
$R_n$           2.9166      1.8229     10.3294
R                .0783      1.6311      1.6372
R                .2764      1.7133      1.7897
$R_{11}$         .7191      2.0253      2.5424
$R_{12}$        0.0000      2.9167      2.9167
$R_{15B}$        .5833     22.1271     22 Mil
$R_{15}$        0.0000     21.5059     21.5059
$R_{22}$        0.0000      1.8860      1.8860

Ratio estimators have been used quite extensively in sample surveys,
not only as estimators of population ratios, but as estimators of population
means and totals.  It has been demonstrated that in a great many situations
the ratio estimator has a smaller variance than the traditional mean per unit
estimator.  A major drawback to the ratio estimator is the fact that it is
biased, although in large samples it has been demonstrated that the bias is
negligible.  In very small samples, or even moderate samples from a
stratified population, no really convincing argument has been given for the
negligibility of the bias, since no exact expression for it is available.
Several authors have avoided this question of bias by developing methods
which eliminate the bias while retaining the essential properties of a ratio
estimator.

This report reviews the usual ratio estimator, giving optimum conditions
for its use.  The bias is approximated and limits for the bias are given, as
well as cases that might arise in which the bias might become an important
factor.  Methods are then considered which give rise to reduced bias estimators,
as well as unbiased ratio-type estimators.  The reduced bias estimators
involve the use of expansions, approximations and a graphical method
to obtain reduced bias estimators.  The latter estimators are divided into
two major classes of development, (1) the elimination of bias through the
use of commonly used sampling schemes, and (2) the elimination of bias
through the use of certain modifications of sampling schemes making the
usual biased estimator unbiased.

Finally a small computer survey is presented in which several of the
estimators are compared with respect to bias and efficiency.


